2023 Research Projects
Projects are posted below; new projects will continue to be posted. To learn more about the type of research conducted by undergraduates, view the archived symposium booklets and search the past SURF projects.
This is a list of research projects that may have opportunities for undergraduate students. Please note that it is not a complete list of every SURF project. Undergraduates will discover other projects when talking directly to Purdue faculty.
You can browse all the projects on the list or view only projects in the following categories:
Big Data/Machine Learning (29)
4-dimensional ultrasound assessment of cardiac remodeling during pregnancy and postpartum lactation
- Biomedical Engineering
- Computer Engineering
More information: https://engineering.purdue.edu/cvirl
AAMP-UP Project 3: Machine Learning and Data Collection
This project is from the AAMP-UP summer program, which is a different program than SURF. AAMP-UP is a 10-week summer program that provides STEM undergraduates the chance to participate in national defense and military research. The program is sponsored by the U.S. Army Research Laboratory in Aberdeen, MD.
- No Major Restriction
More information: https://engineering.purdue.edu/Energetics/AAMP-UP/index_html
Artificial Intelligence for Industrial Systems
- No Major Restriction
More information: https://engineering.purdue.edu/CYNICS
Artificial Intelligence for Manufacturing in Practice
- Electrical Engineering
- Mechanical Engineering
- Industrial Engineering
- Computer Science
- Computer Engineering
Artificial Intelligence for Music and Art
- Computer Engineering
- Computer and Information Technology
- Computer Science
- Music
- Data Science
Data Free Model Extraction
*** Possible industry involvement: Some of these projects are funded by Meta/Facebook research awards and J.P.Morgan AI research awards. *** We especially encourage applications from women, Aboriginal peoples, and other groups underrepresented in computing.
*** Project 1. Data-Free Model Extraction
Many deployed machine learning models such as ChatGPT and Codex are accessible via a pay-per-query system. An adversary can profit from extracting these models, whether for outright theft or for reconnaissance. Recent model-extraction attacks on Machine Learning as a Service (MLaaS) systems have moved towards data-free approaches, showing the feasibility of stealing models trained with difficult-to-access data. However, these attacks are ineffective or limited due to the low accuracy of extracted models and the high number of queries to the models under attack. The high query cost makes such techniques infeasible for online MLaaS systems that charge per query.
In this project, we will design novel approaches to get higher accuracy and query efficiency than prior data-free model extraction techniques.
Early work and background can be found here:
https://www.cs.purdue.edu/homes/lintan/publications/disguide-aaai23.pdf
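To make the query-cost idea concrete, here is a deliberately tiny sketch of extraction without training data. It is not the technique from the paper linked above: it swaps in the simplest possible case, a linear model recovered by probing with synthetic inputs, and every name in it (`BlackBoxLinearModel`, `extract_linear_model`) is hypothetical. The point is only that the attacker sees outputs, not weights, and that each call counts against a query budget.

```python
import random


class BlackBoxLinearModel:
    """Stand-in for a pay-per-query service; weights are hidden from the caller."""

    def __init__(self, weights, bias):
        self._weights = list(weights)
        self._bias = bias
        self.query_count = 0  # what the service would bill for

    def predict(self, x):
        self.query_count += 1
        return sum(w * xi for w, xi in zip(self._weights, x)) + self._bias


def extract_linear_model(victim, dim):
    """Recover weights and bias with dim + 1 synthetic queries and no training data."""
    bias = victim.predict([0.0] * dim)  # the all-zero input reveals the bias
    weights = []
    for i in range(dim):
        probe = [0.0] * dim
        probe[i] = 1.0  # a unit vector isolates one weight
        weights.append(victim.predict(probe) - bias)
    return weights, bias


rng = random.Random(1)
true_weights = [rng.uniform(-1, 1) for _ in range(5)]
victim = BlackBoxLinearModel(true_weights, bias=0.3)
stolen_weights, stolen_bias = extract_linear_model(victim, dim=5)
print(f"queries used: {victim.query_count}")
```

Deep models cannot be recovered this directly, which is why accuracy and query efficiency become the two quantities to trade off.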
*** Project 2. Language Models for Detecting and Fixing Software Bugs and Vulnerabilities
In this project, we will develop machine learning approaches including code language models to automatically learn bug and vulnerability patterns and fix patterns from historical data to detect and fix software bugs and security vulnerabilities. We will also study and compare general code language models and domain-specific language models.
Early work and background can be found here:
Impact of Code Language Models on Automated Program Repair. ICSE 2023. Forthcoming.
KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair. ICSE 2023. Forthcoming.
https://www.cs.purdue.edu/homes/lintan/publications/cure-icse21.pdf
https://www.cs.purdue.edu/homes/lintan/publications/deeplearn-tse18.pdf
*** Project 3. Inferring Specifications from Software Text for Finding Bugs and Vulnerabilities
A fundamental challenge of detecting or preventing software bugs and vulnerabilities is to know programmers’ intentions, formally called specifications. If we know the specification of a program (e.g., where a lock is needed, what input a deep learning model expects, etc.), a bug detection tool can check if the code matches the specification.
Building on our experience as the first to extract specifications from code comments to automatically detect software bugs and bad comments, in this project we will analyze various new sources of software textual information (such as API documents and Stack Overflow posts) to extract specifications for bug detection. For example, the API documents of deep learning libraries such as TensorFlow and PyTorch contain a lot of input constraint information about tensors. Language models may be explored.
Early work and background can be found here:
https://www.cs.purdue.edu/homes/lintan/projects.html
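As a rough illustration of turning documentation text into a checkable specification, the sketch below pulls numeric range constraints out of a documentation snippet with a regular expression and then checks a call against them. The documentation text, parameter names, and helper functions are all invented for this example; real API documents are far less regular, which is part of what makes the research problem hard.

```python
import re

# Hypothetical documentation snippet, written for this example only.
DOC = """
axis: An int. Must be in the range [-1, 3].
rate: A float. Must be in the range [0, 1].
name: A string. Optional name for the operation.
"""

# Matches lines of the form "<param>: ... in the range [<low>, <high>]".
RANGE_PATTERN = re.compile(
    r"^(?P<param>\w+):.*?in the range "
    r"\[(?P<low>-?\d+(?:\.\d+)?), (?P<high>-?\d+(?:\.\d+)?)\]",
    re.MULTILINE,
)


def extract_range_specs(doc):
    """Map each documented parameter to its (low, high) range, if one is stated."""
    return {
        m["param"]: (float(m["low"]), float(m["high"]))
        for m in RANGE_PATTERN.finditer(doc)
    }


def check_call(specs, **kwargs):
    """Return a message for every argument that violates an extracted range."""
    violations = []
    for name, value in kwargs.items():
        if name in specs:
            low, high = specs[name]
            if not low <= value <= high:
                violations.append(f"{name}={value} outside [{low}, {high}]")
    return violations


specs = extract_range_specs(DOC)
violations = check_call(specs, axis=5, rate=0.5)
print(violations)
```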
*** Project 4. Testing Deep Learning Systems
We will build cool and novel techniques to make deep learning code such as TensorFlow and PyTorch reliable and secure. We will build it on top of our award-winning paper (ACM SIGSOFT Distinguished Paper Award)!
Machine learning systems including deep learning (DL) systems demand reliability and security. DL systems consist of two key components: (1) models and algorithms that perform complex mathematical calculations, and (2) software that implements the algorithms and models. Here software includes DL infrastructure code (e.g., code that performs core neural network computations) and the application code (e.g., code that loads model weights). Thus, for the entire DL system to be reliable and secure, both the software implementation and models/algorithms must be reliable and secure. If software fails to faithfully implement a model (e.g., due to a bug in the software), the output from the software can be wrong even if the model is correct, and vice versa.
This project aims to use novel approaches including differential testing to detect and localize bugs in DL software (including code and data) to address the testing oracle challenge.
Early work and background can be found here: https://www.cs.purdue.edu/homes/lintan/publications/eagle-icse22.pdf https://www.cs.purdue.edu/homes/lintan/publications/fairness-neurips21.pdf https://www.cs.purdue.edu/homes/lintan/publications/variance-ase20.pdf https://www.cs.purdue.edu/homes/lintan/publications/cradle-icse19.pdf
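Differential testing, mentioned above, sidesteps the missing-oracle problem by running two implementations of the same computation on the same inputs and flagging any disagreement. The sketch below is a toy under stated assumptions: both softmax functions are written for this example (they are not taken from TensorFlow, PyTorch, or the linked papers), and a crash is treated as observable behaviour to compare.

```python
import math
import random


def softmax_reference(xs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_naive(xs):
    """Hypothetical second implementation with a latent overflow bug."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def run(impl, xs):
    """Run one implementation, recording a crash instead of propagating it."""
    try:
        return "ok", impl(xs)
    except (OverflowError, ZeroDivisionError) as exc:
        return "error", type(exc).__name__


def differential_test(impl_a, impl_b, inputs, tol=1e-9):
    """Return the inputs on which the two implementations disagree."""
    disagreements = []
    for xs in inputs:
        status_a, out_a = run(impl_a, xs)
        status_b, out_b = run(impl_b, xs)
        if status_a != status_b:
            disagreements.append(xs)
        elif status_a == "ok" and any(
            abs(a - b) > tol for a, b in zip(out_a, out_b)
        ):
            disagreements.append(xs)
    return disagreements


rng = random.Random(0)
small_inputs = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(100)]
large_inputs = [[1000.0, 999.0, 998.0]]  # large logits trigger the overflow
found = differential_test(softmax_reference, softmax_naive, small_inputs + large_inputs)
print(found)
```

Neither implementation is assumed correct in advance; the disagreement only says that at least one of them is wrong on that input, and localizing which one is a separate step.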
- Computer Science
- Computer Engineering
- Software Engineering
More information: https://www.cs.purdue.edu/homes/lintan/
Development of protein biomarkers from biofluids for non-invasive early detection and monitoring of cancers
- Computer Science
- Biochemistry
- Biomedical Engineering
- Chemistry
- Biology
More information: http://www.protaomics.org/
Development of single cell pathway analysis benchmark
Single cell pathway analysis typically involves the use of single cell omics technologies such as single cell transcriptomics (scRNA-seq), single cell proteomics, or single cell epigenetics. These techniques provide a high-throughput and comprehensive view of the molecular changes taking place within individual cells.
Applications of single cell pathway analysis include the study of development, disease, and cellular signaling. For example, it can be used to uncover the complex molecular changes that occur during cell differentiation and the progression of diseases such as cancer. It can also be used to study the effects of drugs and other treatments on individual cells.
Multiple methods have been developed to perform single cell analysis; however, how well these methods perform remains unclear. The aim of this project is to develop a benchmark to evaluate various single cell pathway analysis methods.
- No Major Restriction
More information: https://kazemianlab.com
EMBRIO Institute - High resolution imaging (project 1) and computational modeling (project 2) to test decoding of Ca2+-flux frequency by CaM and CaMKII role in dynamic actin polymerization and dendritic spine morphology.
Project 2: This summer research project will use computational modeling of Ca2+/Calmodulin and CaMKII interactions in dendritic spines to test the hypothesis that decoding of Ca2+-flux frequency by CaM and CaMKII plays a major role in dynamic actin polymerization and dendritic spine morphology. Computational tools that will be used include ordinary and partial differential equations and machine learning techniques to rapidly explore model parameter space.
Research Question Overview:
Neuronal synapses are tightly regulated intercellular junctions that rapidly convey information from an upstream pre-synaptic neuron to a downstream post-synaptic neuron. Dynamic strengthening or weakening of synaptic connective strength, known as synaptic plasticity, is a critical feature of neuronal function. The direction of synaptic plasticity (increased connective strength (LTP) versus decreased connective strength (LTD)) depends on the timing of action potentials (AP), which is translated into frequency signals of Ca2+ ion flux through NMDA receptors (NMDAR) located on dendritic spines (100-500nm mushroom-like protrusions that form the post-synapse).
The timing and direction of synaptic plasticity are also exquisitely regulated by dynamic organization and spatial localization of synaptic adhesion molecules, signaling receptors, ion channels, and the intracellular cytoskeleton within spines. However, it is not clear how these electrical, biochemical, and mechanical cues are integrated to produce robust, repeatable, and highly dynamic synaptic plasticity that lasts over the lifetime of a neuron (decades). Our recent work has shown that competition for CaM-binding can influence the Ca2+ frequency-dependence of protein activation and downstream signaling. In particular, the highly expressed Ca2+/calmodulin-dependent kinase II (CaMKII) plays a key role in synaptic plasticity via two important aspects of its function: (1) CaMKII is highly involved in Ca2+-dependent signal transduction via phosphorylation of a number of downstream proteins including ion channels, guanine nucleotide exchange factors (GEFs), GTPase activating proteins (GAPs), and transcription factors, and (2) CaMKII acts as a multivalent scaffold that binds multiple proteins simultaneously and localizes them to post-synaptic spines, including both filamentous and monomeric actin, and may regulate actin polymerization in the spine.
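The project description above names ordinary differential equations as one of the modeling tools. As a hedged illustration only, the sketch below integrates a single toy binding equation, d(bound)/dt = k_on * Ca(t) * (1 - bound) - k_off * bound, under square Ca2+ pulses delivered at two frequencies. The rate constants, pulse shape, and function names are arbitrary assumptions chosen for readability; this is not the group's model and contains no CaMKII or actin dynamics.

```python
def calcium(t, freq_hz, pulse_width=0.01, amplitude=1.0):
    """Square-pulse Ca2+ signal: `amplitude` during each pulse, zero otherwise."""
    period = 1.0 / freq_hz
    return amplitude if (t % period) < pulse_width else 0.0


def mean_bound_fraction(freq_hz, k_on=10.0, k_off=5.0, dt=1e-4, t_end=2.0):
    """Forward-Euler integration of the toy binding ODE; returns the time average."""
    bound = 0.0  # fraction of binding sites occupied, starts empty
    total = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        ca = calcium(i * dt, freq_hz)
        bound += dt * (k_on * ca * (1.0 - bound) - k_off * bound)
        total += bound
    return total / steps


mean_low = mean_bound_fraction(freq_hz=5.0)
mean_high = mean_bound_fraction(freq_hz=50.0)
print(f"5 Hz: {mean_low:.3f}   50 Hz: {mean_high:.3f}")
```

Because the pulse width is fixed, the higher frequency also delivers more Ca2+ per unit time, so the larger bound fraction here reflects dose as much as timing; separating those two effects is exactly the kind of question a fuller model has to address.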
- No Major Restriction
More information: https://www.purdue.edu/research/embrio/research/index.php
EMBRIO Institute - Mechanistic models of Calcium signaling and its downstream effects
- No Major Restriction
Localized Deep Learning for Decentralized and Dynamic Environments
REU participants will be part of a collaborative team focused on developing novel localized deep learning approaches. One particular target project is a novel localized deep learning approach that we have named a Minimal Learning Unit (MLU). The goal is to create a learning algorithm with local objectives that learns rich unsupervised representations in a highly decentralized and fault-tolerant way. As one specific context, suppose a sensor network should be trained to detect a complex or global event such as anomalous activity over a large area of the wilderness. Each sensor has a very incomplete picture of the situation and can communicate with nearby sensors but cannot communicate with a global centralized server. The goal is to implement both width-parallel and depth-parallel learning on an unreliable set of sensor devices that have limited compute power. This project will focus on the fundamental aspects of novel local learning mechanisms in this highly decentralized environment.
- No Major Restriction
Mobility Evolution in the US: Evidence from Bike-sharing and Electric Vehicle Adoption
- No Major Restriction
More information: https://engineering.purdue.edu/STSRG; https://engineering.purdue.edu/ASPIRE
Model and control strategy development to modernize the pharmaceutical tablet manufacturing process
In a dry granulation tableting line, the powders are transformed into granules before being compressed into tablets. The granulation step can increase the powder flowability by enlarging particle size and improve the powder blend's content uniformity by minimizing segregation. The goals of this project include (1) investigating the impact of granulation on final tablet qualities and building high-fidelity models using first principles and machine learning, (2) developing soft sensors to predict critical quality attributes such as tensile strength in real time, and (3) implementing a model-based process control strategy to control end-to-end pharmaceutical manufacturing processes. All research work will be conducted in Purdue's newly installed tablet manufacturing pilot plant at the FLEX Lab in Discovery Park.
- No Major Restriction
Modernization of Pharmaceutical Drug Product Manufacturing
In this project, we will investigate ribbon splitting in a roller compactor, a phenomenon that can adversely affect the quality of the product granules coming out of the roller compactor. Little is known about its impact on the product quality or about the predictability of the phenomenon. The ability to predict this phenomenon can be a boon to effective implementation of condition-based maintenance strategies, which have been accepted as a critical requirement for the successful shift to continuous pharmaceutical manufacturing. This study requires particle technology expertise, which will be provided by Prof. Marcial Gonzalez in Mechanical Engineering, as well as process systems engineering expertise provided by Prof. Rex Reklaitis and Prof. Zoltan Nagy in Chemical Engineering.
- No Major Restriction
Molecular microscopy to inform the design of medications
- No Major Restriction
More information: http://www.chem.purdue.edu/simpson/
Physics-Informed Machine Learning to Improve the Predictability of Extreme Weather Events
Traditionally, prediction of extreme weather events is based on direct numerical simulation of regional or global atmospheric models, which are expensive to conduct and involve a large number of tunable parameters. Building on the rapid rise of data science and machine learning in recent years, this proposed work will instead apply a convolutional neural network to an idealized atmospheric model to conduct predictability analysis of extreme weather events within this model. With this proposed machine-learning algorithm, our project will provide a robust forecast of heat waves and atmospheric blocking with a lead time of a few weeks. With more frequent record-breaking heat waves in the future, such a prediction will offer a crucial period of time (a few weeks) for our society to take proper preparedness steps to protect our vulnerable citizens.
This project is based on developing and verifying the machine learning algorithm for detecting extreme weather events in an idealized model. We will use Purdue’s supercomputer Bell to conduct the simulations. The undergraduate student will play an active and important role in running the idealized model, and participate in developing the algorithms. As an important component of climate preparedness, the proposed work aims to develop a physics-informed machine learning framework to improve predictability of extreme weather events.
Closely advised by Prof. Wang, the student will conduct numerical simulations of an idealized and very simple climate model, and use Python-based machine learning tools to predict extreme weather events within the model. Prof. Wang will provide weekly tutorial sessions to teach key techniques along with interactive hands-on sessions. The student will get access to large datasets on Purdue’s Data Depot and will analyze and visualize data from an idealized atmospheric model. The student will use convolutional neural networks (CNNs) to train and assess a machine-learning model. The student will further use a feature-tracking algorithm to trace back and identify the physical structure in the atmosphere that is responsible for the onset of extreme weather events.
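For readers new to CNNs, the core operation of a convolutional layer is small enough to write out by hand. The sketch below slides a 2x2 averaging kernel over a made-up 4x4 grid of anomaly values and applies a ReLU; the grid, kernel, and function names are illustrative assumptions and have nothing to do with the project's actual model or data. In practice a library such as PyTorch or TensorFlow would do this, with the kernel values learned rather than fixed.

```python
def conv2d_valid(field, kernel):
    """Slide `kernel` over `field` with no padding (cross-correlation, as in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(field) - kh + 1
    out_w = len(field[0]) - kw + 1
    return [
        [
            sum(
                field[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise max(0, x), the usual nonlinearity after a convolution."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Made-up 4x4 grid of anomalies with a 2x2 warm patch in the middle.
field = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 2.0, 0.0],
    [0.0, 2.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
kernel = [[0.25, 0.25], [0.25, 0.25]]  # fixed averaging filter, not a learned one

feature_map = relu(conv2d_valid(field, kernel))
for row in feature_map:
    print(row)
```

The response is strongest where the kernel fully overlaps the warm patch, which is the sense in which a convolutional filter localizes a spatial feature.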
- No Major Restriction
Quantum Characterization Setup Software Development
- No Major Restriction
Quantum Characterization Setup Software Development
- No Major Restriction
RCAC Anvil REU Internship (x6)
1. Data analytics: Instrument and perform analysis of scientific application workloads on the Anvil system
2. High Performance Computing (HPC): Extend the Anvil system to burst scientific workflows into the Microsoft Azure cloud
3. Kubernetes: To support science gateways applications, extend Anvil’s Kubernetes-based composable subsystem to use cloud-based Kubernetes clusters
4. Containers to Support Education: Enable the use of large-scale notebook deployments to provide interactive access to Anvil in support of education
Applicants must be U.S. citizens. Open to undergrad students from all backgrounds.
- No Major Restriction
More information: https://www.rcac.purdue.edu/anvil/reu
Rapid characterization of high temperature alloys
This SURF project aims to characterize the strength and oxidation resistance of tens to hundreds of refractory alloys using high-throughput characterization methods. Such methods for this project could include: Raman microscopy, surface profilometry, X-ray diffraction, automated scanning electron microscopy, and indentation. As part of this project, you will learn at least two of these methods and apply them to compositionally graded specimens comprising up to 85 unique alloys - potentially encompassing thousands of unique alloy compositions.
Significant data will be collected during this project, and the data must be collected and stored according to the FAIR principles (Findability, Accessibility, Interoperability, Reuse). Thus, some background in Python programming and Excel is desired for this project. It is expected that at the end of this project, you will publish a publicly accessible NanoHub.org tool that enables users from across the world to access and analyze the data.
- Materials Engineering
- Chemical Engineering
- Mechanical Engineering
- Physics
Real-Time Measurements of Volatile Chemicals in Buildings with Proton Transfer Reaction Mass Spectrometry
- No Major Restriction
More information: https://www.purdue.edu/newsroom/stories/2020/Stories%20at%20Purdue/new-purdue-lab-provides-tiny-home-for-sustainability-education.html
SCALE: Optimizing MXene properties
Most of the materials we encounter in our daily lives are ‘bulk’ materials – they contain an enormous number of atoms in all three dimensions. However, if we instead consider materials with one dimension of only a few atoms in thickness, like graphene, we can achieve many physical and chemical properties distinct from those of their bulk counterparts. For example, 2D magnetic materials have drawn significant attention because of their application in spintronics and quantum computing. One class of 2D materials with the potential to serve as the first room-temperature 2D magnets is MXenes, near atomically thin transition metal carbides or nitrides. For a magnetic material, the configuration can be ferromagnetic (FM) or antiferromagnetic (AFM) depending on the direction of spins of electrons. Using electronic structure calculations based on density functional theory (DFT), we can identify the magnetic configuration with lower energy. Further, the critical temperature, e.g. Curie temperature, is the temperature above which the material loses its spontaneous magnetization. For real-world applications, magnetic materials with a critical temperature that is higher than room temperature are desired. This project will use DFT calculations to discover magnetic MXenes with high Curie temperatures.
In your application, please specify which of the SCALE technical areas you are most interested in. The technical areas are:
• Radiation Hardening
• System-on-Chip
• Heterogeneous Integration/ Advanced Packaging
• Program Evaluation
Be sure to name any specific SCALE projects you are interested in, and include information about how you meet the required and desired experience and skills for each of these projects.
For US citizen students who are interested: you can become part of the Purdue microelectronics program called SCALE, sponsored by the Department of Defense. In SCALE, you will have opportunities for continuing research (paid or for credit) and industry and government internships throughout your time at Purdue. Please apply to SCALE here: https://research.purdue.edu/scale/.
- No Major Restriction
More information: https://www.strachanlab.org
SCALE: Strain effect on properties of 2D MXene materials
2D materials are a class of crystalline solids with a single layer only a few atoms thick. Because of their ultrathin body, 2D materials possess unique physical and chemical properties that are usually not seen in their bulk counterparts. 2D materials are now widely applied in solar cells, memory devices, and chemical sensors. One emerging subset of the 2D materials class is MXenes, a new type of 2D material that has been successfully synthesized and studied in the last decade. MXenes are atomically thin layers of a transition metal carbide or nitride. The properties of a specific MXene are not always suitable for a given application, and one way to tune their properties is to apply strain. Mechanical strain affects the electronic and magnetic properties of materials because it changes their crystal structure. For example, the band gap of a material is an important property for electronic applications, and studies have shown that for some 2D materials, biaxial tensile strain decreases the band gap. Different strains, including biaxial, uniaxial, tensile, and compressive, also each have a different effect on the properties. In this project, the strain-tuned electronic and magnetic properties of novel MXenes will be studied. The physical mechanism behind the strain-induced properties will be characterized based on the change of crystal structures.
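The difference between biaxial and uniaxial strain is easiest to see on the lattice vectors themselves. The sketch below applies a small diagonal strain to a hexagonal 2D cell by scaling the in-plane Cartesian components, with positive values meaning tensile and negative values compressive. The 3.0 Å lattice constant, the 20 Å vacuum spacing, and the function names are placeholders for illustration, not values for any particular MXene.

```python
import math


def apply_strain(lattice, eps_x=0.0, eps_y=0.0):
    """Scale x and y components of each lattice vector; the out-of-plane z is untouched."""
    return [
        [vx * (1.0 + eps_x), vy * (1.0 + eps_y), vz]
        for vx, vy, vz in lattice
    ]


def in_plane_lengths(lattice):
    """Lengths of the two in-plane lattice vectors."""
    a, b = lattice[0], lattice[1]
    return math.hypot(a[0], a[1]), math.hypot(b[0], b[1])


# Hypothetical hexagonal cell: a = 3.0 angstrom, 20 angstrom of vacuum along z.
A = 3.0
hexagonal = [
    [A, 0.0, 0.0],
    [-A / 2.0, A * math.sqrt(3.0) / 2.0, 0.0],
    [0.0, 0.0, 20.0],
]

biaxial_tensile = apply_strain(hexagonal, eps_x=0.02, eps_y=0.02)  # +2% in both directions
uniaxial_compressive = apply_strain(hexagonal, eps_x=-0.02)        # -2% along x only

print(in_plane_lengths(biaxial_tensile))
print(in_plane_lengths(uniaxial_compressive))
```

Biaxial strain keeps the two in-plane vectors equal in length and so preserves the hexagonal symmetry, while uniaxial strain makes them unequal, which is one reason the two can affect electronic properties differently.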
In your application, please specify which of the SCALE technical areas you are most interested in. The technical areas are:
• Radiation Hardening
• System-on-Chip
• Heterogeneous Integration/ Advanced Packaging
• Program Evaluation
Be sure to name any specific SCALE projects you are interested in, and include information about how you meet the required and desired experience and skills for each of these projects.
For US citizen students who are interested: you can become part of the Purdue microelectronics program called SCALE, sponsored by the Department of Defense. In SCALE, you will have opportunities for continuing research (paid or for credit) and industry and government internships throughout your time at Purdue. Please apply to SCALE here: https://research.purdue.edu/scale/.
More information: https://www.strachanlab.org
Searching for bound top quark states in the CMS proton-proton collision data from the Large Hadron Collider
Candidates will be able to use a vast sample of top quark data, literally hundreds of millions of top quarks, to search for any evidence of new particles. The Jung group even uses quantum computers to boost efficiency for reconstructing events, and participants can have a choice in the direction and emphasis of the research project, to the limits of what is possible. Students will contribute to the review process of analysis and publication and, given sustained engagement over multiple semesters, have a chance to be an author on technical/algorithm publications or even physics publications (provided contributions are above the required threshold).
More information: https://www.physics.purdue.edu/jung/
Super-Resolution Optical Imaging with Single Photon Counting and Optomechanics with Nanostructured Membranes
- Electrical Engineering
- Mechanical Engineering
- Physics
- Biomedical Engineering
Toward Calibration of Cognitive Factors (Trust, Self-Confidence, Risk) for Enhancing Human Interaction with Automation
- No Major Restriction
Using Machine Learning to Discover Perovskite Photocatalysts
Targeted Need: Challenges of environmental pollution, global energy shortage, and overreliance on fossil fuels can be addressed using photocatalysis, where solar energy is harnessed for chemical processes such as hydrogen production, degradation of pollutants, and CO2 reduction [1]. Many semiconductors have been used as photocatalysts based on suitable band edge positions relative to redox potentials, strong optical absorption, and desirable adsorption and desorption of chemical species; examples include TiO2, Ga2O3, C3N4, CdS, and ZnS [2]. However, many limitations exist owing to wider than desired band gaps, ineffectiveness of charge carriers, and formation of harmful defects, motivating the search for novel and improved materials. Cheap and high-performing photocatalysts can also help avoid the use of transition or precious metals such as Pt and Pd as catalysts [3]. The chemical space of potential semiconductor photocatalysts is massive and not conducive to brute-force experimentation or even computation, which necessitates the use of data-driven strategies combining large computational datasets and state-of-the-art machine learning [4], prior to experimental validation and discovery.
Opportunity: Metal halide perovskites (HaPs) have risen in prominence for solar and related optoelectronic applications, and are suggested as promising photocatalysts. Recent publications report the use of MAPbI3, MAPbBr3 (MA=methylammonium), CsPbI3, Cs2BiAgBr6, and other single/double inorganic/hybrid perovskites, either in bulk crystalline form, 2D variants, nanoclusters, or as part of heterostructures, for water splitting, CO2 reduction, and organic synthesis [1,2]. However, this field remains very much in its infancy—HaPs are desirable photovoltaic (PV) materials with extremely tunable properties, but an exhaustive study of band edges, surface energies, and adsorption behavior across a wide chemical space is missing. Using high-throughput density functional theory (HT-DFT) computations, our research group has developed an initial dataset of the stability, band gap, and optical absorption characteristics of ABX3 HaPs with mixing at A, B, or X sites using common elemental or molecular species [5]. This provides the starting point for exploring photocatalytic activity of HaPs as a function of composition, phase, and surface orientation, by combining HT-DFT with machine learning (ML). Since DFT computations are expensive and cannot be performed endlessly, ML models trained on DFT data can help predict optical, electronic, surface, and adsorption properties of millions of new perovskite compositions, to accelerate by several orders of magnitude the screening of novel HaPs with a suitable combination of properties for catalyzing reactions.
Objectives: In this project, a HT-DFT+ML prediction, screening, and design approach will be applied to discover novel HaP compositions that display desired stability, optical absorption, surface stability, and activity towards species, for next-generation photocatalysis of technologically-important chemical processes, including CO2 reduction, H2 and O2 evolution (water splitting), and synthesis of various hydrocarbons. Specific objectives include: (i) using the existing DFT dataset of HaP crystal structures to build surface slabs, calculate surface energies, and adsorption energies of various molecules on stable surfaces, (ii) unique encoding of each material (descriptors) in terms of structure, composition, surface atoms, adsorbing species, etc. [4], and (iii) training of ML models based on regression techniques such as random forests and neural networks, ensuring rigorous optimization of hyperparameters, training data size, input dimensions, and applicability towards any new data point.
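Objective (iii) mentions hyperparameter optimization. One common way to do it is k-fold cross-validation, sketched below in plain Python. To stay self-contained, the sketch swaps in a k-nearest-neighbour regressor on a single synthetic descriptor instead of the random forests or neural networks named above, and the data are generated on the spot; none of it comes from the group's DFT dataset or code.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Predict the target at x as the mean over the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])


def cross_val_mse(data, k, folds=5):
    """Mean squared error of k-NN regression under `folds`-fold cross-validation."""
    errors = []
    for f in range(folds):
        held_out = data[f::folds]  # every folds-th point is held out
        train = [p for i, p in enumerate(data) if i % folds != f]
        errors.extend((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
    return statistics.fmean(errors)


# Synthetic data: one made-up descriptor x and a noisy linear "property" y.
rng = random.Random(42)
data = []
for _ in range(100):
    x = rng.uniform(0.0, 1.0)
    data.append((x, 1.5 + 2.0 * x + rng.gauss(0.0, 0.2)))

# The hyperparameter being tuned is the number of neighbours.
scores = {k: cross_val_mse(data, k) for k in (1, 3, 5, 9, 15)}
best_k = min(scores, key=scores.get)
print(best_k, round(scores[best_k], 4))
```

The same loop applies unchanged to other models: only `knn_predict` and the hyperparameter grid would be replaced.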
Role of Student Researcher: Using our available codes, software, and computing resources, students can quickly start running and analyzing simulations of photocatalytic properties. A variety of existing schemes can be applied and tested for numerical representation/description of materials and property prediction, such as using graph convolutional neural networks (GCNNs) for automatic crystal structure representation, with which our group has good experience. The student will carry out DFT and ML tasks under the guidance of a graduate student and the professor, and will be given the opportunity to lead one or two potentially high-impact journal publications. Given the prior work that has gone into this project, chances of success are very high, and future prospects will be plenty.
References
1. J. Yuan et al., Nanoscale, 13, 10281 (2021).
2. K. Ren et al., Journal of Materials Chemistry A, 10, 407 (2022).
3. Z. Luo et al., Nature Communications, 11, 4091 (2020).
4. J. Schmidt et al., npj Computational Materials, 5, 83 (2019).
5. A. Mannodi-Kanakkithodi et al., Energy and Environmental Science, 15, 1930-1949 (2022).
- No Major Restriction
More information: https://www.mannodigroup.com/
Using network science for precision learning intervention
- No Major Restriction
Vaginal Microbiome Regulation of Progesterone Signaling
- No Major Restriction